
feat: pymat.search(query) — fuzzy domain-library search #86

Merged: gerchowl merged 2 commits into main from feature/84-pymat-search on Apr 19, 2026

@gerchowl
Contributor

Summary

Adds pymat.search(query), a top-level fuzzy-find verb over the py-mat material registry, symmetric with pymat.vis.search(...) on the visual side.

Closes #85.

Design

  • Tokenized, conjunctive matching: every whitespace-separated query token must match at least one weighted target.
  • Targets: registry key (weight 10), Material.name / grade (5), hierarchy parent names (3).
  • Tie-break: shorter registry key first — parents rank above longer-keyed descendants.
  • Calls load_all() so results are exhaustive across categories. Matching is case-insensitive.
  • Returns list[Material] sorted best-first, truncated to limit=.
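
The design bullets above can be sketched in a few lines. This is a minimal illustration, not the code merged in this PR: the Material dataclass, its field names, the registry list and the material keys are hypothetical stand-ins, and two details the description does not state are assumed here (a token "matches" a target when it is a substring of it, and a material's score is the sum, over tokens, of the highest weight among the targets each token hits). The final lexicographic tie-break is also an assumption, added only to keep the order fully deterministic.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Weights taken from the PR description: key 10, name/grade 5, parents 3.
KEY_WEIGHT, NAME_WEIGHT, PARENT_WEIGHT = 10, 5, 3


@dataclass(frozen=True)
class Material:
    """Hypothetical stand-in for the real Material class."""
    key: str
    name: str
    grade: Optional[str] = None
    parents: Tuple[str, ...] = ()  # names of hierarchy parents


def _targets(material: Material) -> List[Tuple[str, int]]:
    """Lower-cased (text, weight) pairs a token may match against."""
    targets = [(material.key, KEY_WEIGHT), (material.name, NAME_WEIGHT)]
    if material.grade:
        targets.append((material.grade, NAME_WEIGHT))
    targets.extend((parent, PARENT_WEIGHT) for parent in material.parents)
    return [(text.lower(), weight) for text, weight in targets]


def _score(material: Material, tokens: List[str]) -> int:
    """Sum of each token's best-matching weight; 0 if any token misses."""
    targets = _targets(material)
    total = 0
    for token in tokens:
        best = max((w for text, w in targets if token in text), default=0)
        if best == 0:  # conjunctive: one miss rejects the material
            return 0
        total += best
    return total


def search(registry: List[Material], query: str, *, limit: int = 10) -> List[Material]:
    tokens = query.lower().split()
    if not tokens or limit <= 0:
        return []
    scored = [(s, m) for m in registry if (s := _score(m, tokens)) > 0]
    # Best score first, then shorter key, then key text (assumed) for stability.
    scored.sort(key=lambda pair: (-pair[0], len(pair[1].key), pair[1].key))
    return [m for _, m in scored[:limit]]


registry = [
    Material("stainless.s316L", "Stainless Steel 316L", grade="316L",
             parents=("Stainless Steel",)),
    Material("aluminum", "Aluminum"),
    Material("stainless", "Stainless Steel"),
]

print([m.key for m in search(registry, "stainless")])
print([m.key for m in search(registry, "Stainless 316")])
```

With this toy registry, "stainless" scores 10 on both stainless entries, so the shorter key (the parent) comes first, while "Stainless 316" keeps only the 316L entry because the parent has no target containing "316".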

Test coverage

30 new tests in tests/test_search.py:

  • Public surface + __all__ membership
  • Empty / whitespace query → []
  • Exact key match ranks first (incl. case-insensitive)
  • Name substring hits
  • Grade match
  • Conjunctive multi-token (a miss on any token rejects)
  • Tokens spanning name + hierarchy (e.g. "lyso saint")
  • Key beats name when both match
  • Shorter-key tiebreak (parent before descendant)
  • limit= truncates after ranking, respects ordering
  • Default limit = 10; limit=0 → []
  • Triggers category loading for exhaustive results
  • Internal _targets / _score unit tests pinning weights + logic
  • Determinism (same query → same order)
  • Regression guards for realistic queries
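
As an illustration of what the limit= and determinism bullets guard against, the sketch below ranks a list of already-scored hits and only then truncates. It is not taken from tests/test_search.py: rank_then_truncate, the (key, score) pair shape and the sample keys are hypothetical, and the last sort component (the key text itself) is an assumed extra tie-break.

```python
import random


def rank_then_truncate(scored, limit=10):
    """scored: iterable of (registry_key, score) pairs in arbitrary order."""
    if limit <= 0:
        return []
    # Sort first so the cut-off always keeps the best hits, whatever the
    # input order: higher score, then shorter key, then key text.
    ordered = sorted(scored, key=lambda pair: (-pair[1], len(pair[0]), pair[0]))
    return [key for key, _ in ordered[:limit]]


hits = [("steel.a36", 5), ("stainless.s316L", 10), ("stainless", 10)]

# The best hit sits last in the input but still survives limit=1.
assert rank_then_truncate(hits, limit=1) == ["stainless"]

# Shuffling the input does not change the output order.
shuffled = hits[:]
random.Random(0).shuffle(shuffled)
assert rank_then_truncate(shuffled) == rank_then_truncate(hits)
```

Truncating before sorting would return whichever hits happened to load first, which is why the ordering has to be fixed before the slice.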

Semver

Minor bump: 3.2.1 → 3.3.0. New public API, zero existing-behavior changes.

Test plan

  • 329 tests pass locally (was 299; +30).
  • Version bumps propagated: pyproject.toml, src/pymat/__init__.py, .release-please-manifest.json, CHANGELOG.

3.3.0 minor bump — new public API; zero existing-behavior changes.

- pymat.search(query, *, limit=10) -> list[Material]
- Tokenized, conjunctive matching: every whitespace token must hit
  somewhere (registry key / name / grade / hierarchy parent name).
- Weighted targets: key (10) > name/grade (5) > hierarchy path (3).
- Tie-break: shorter registry key ranks first — parents land above
  longer-keyed descendants when both score the same.
- Triggers load_all() so results are exhaustive across categories.
- Complements pymat.vis.search() (visual catalog) with a symmetric
  domain-side verb — two axes, same verb, no namespace collision.

30 tests in tests/test_search.py pinning tokenization, weights,
conjunctive matching, tie-breaks, limit= semantics, load_all side
effect, determinism, and realistic queries ("stainless 316",
"lyso ce saint", etc.).
@gerchowl gerchowl merged commit bf9629a into main Apr 19, 2026
17 checks passed
@gerchowl gerchowl deleted the feature/84-pymat-search branch April 19, 2026 11:31
Development

Successfully merging this pull request may close these issues.

pymat.search(query) — fuzzy search over the domain library (incl. grades)
